Multi-Elimination ILU Preconditioners on GPUs

Authors

  • Dimitar Lukarski
  • Hartwig Anzt
  • Stanimire Tomov
  • Jack Dongarra
Abstract

Iterative solvers for sparse linear systems often benefit from preconditioners. While many iterative methods have implementations that leverage the computing power of accelerators, porting the latest developments in preconditioners to accelerators has been challenging. In this paper we develop a self-adaptive multi-elimination preconditioner for graphics processing units (GPUs). The preconditioner is based on a multi-level incomplete LU factorization and uses a direct dense solver for the bottom-level system. For test matrices from the University of Florida matrix collection, we investigate the influence of handling the triangular solvers in the distinct iteration steps in either single or double precision arithmetic. When integrated into a Conjugate Gradient method, our multi-elimination algorithm is highly competitive with popular preconditioners, including multi-colored symmetric Gauss-Seidel relaxation preconditioners and (multi-colored symmetric) ILU, for numerous problems.
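To make the structure described in the abstract more concrete, here is a minimal sketch of a single elimination level used as a preconditioner inside Conjugate Gradient. It is an illustration under stated assumptions, not the authors' GPU implementation: it uses dense NumPy arrays on the CPU, keeps the Schur complement exact instead of dropping entries, stops after one level rather than recursing, and works in double precision only. The function names (greedy_independent_set, build_one_level, pcg) are hypothetical, and the matrix is assumed to be symmetric positive definite.

```python
import numpy as np

def greedy_independent_set(A):
    """Greedily pick indices with no mutual coupling in A (assumes a symmetric pattern)."""
    n = A.shape[0]
    blocked = np.zeros(n, dtype=bool)
    chosen = []
    for i in range(n):
        if not blocked[i]:
            chosen.append(i)
            # Block every neighbour of i so that no two chosen unknowns are coupled.
            blocked[np.nonzero(A[i])[0]] = True
            blocked[i] = True
    return np.array(chosen)

def build_one_level(A):
    """Return a function applying M^{-1} built from one elimination level."""
    n = A.shape[0]
    ind = greedy_independent_set(A)
    rest = np.setdiff1d(np.arange(n), ind)
    d = A[ind, ind]                  # diagonal block D (diagonal by construction)
    F = A[np.ix_(ind, rest)]
    E = A[np.ix_(rest, ind)]
    C = A[np.ix_(rest, rest)]
    # Schur complement; kept exact here, whereas an incomplete
    # factorization would drop entries and recurse on S.
    S = C - E @ (F / d[:, None])

    def apply(r):
        z = np.empty_like(r)
        y1 = r[ind] / d                              # D^{-1} r1
        z2 = np.linalg.solve(S, r[rest] - E @ y1)    # dense bottom-level solve
        z[rest] = z2
        z[ind] = y1 - (F @ z2) / d
        return z

    return apply

def pcg(A, b, apply_minv, tol=1e-10, maxit=200):
    """Preconditioned Conjugate Gradient; returns the solution and iteration count."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_minv(r)
    p = z.copy()
    rz = r @ z
    for k in range(maxit):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

# Small demo: 1D Laplacian (symmetric positive definite, tridiagonal).
n = 9
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, iterations = pcg(A, b, build_one_level(A))
```

Because the Schur complement is exact in this sketch, the preconditioner inverts the matrix exactly and CG converges almost immediately. A practical variant would drop small entries of S, repeat the elimination on S over several levels, and could apply the preconditioner in lower precision.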

Related resources

On Level Scheduling for Incomplete LU Factorization Preconditioners on Accelerators

The application of the finite element method for the numerical solution of partial differential equations naturally leads to large systems of linear equations represented by a sparse system matrix A and right-hand side b. These systems are commonly solved using iterative solvers, particularly Krylov subspace methods, which are typically accelerated using preconditioners to obtain good convergenc...

Enhanced Parallel ILU(p)-based Preconditioners for Multi-core CPUs and GPUs – The Power(q)-pattern Method

Application demands and grand challenges in numerical simulation require both highly capable computing platforms and efficient numerical solution schemes. Power constraints and further miniaturization of modern and future hardware give way to multi- and manycore processors with increasingly fine-grained parallelism and deeply nested hierarchical memory systems, as already exemplified by recen...

Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

Based on the premise that preconditioners needed for scientific computing are not only required to be robust in the numerical sense, but also scalable for up to thousands of light-weight cores, we argue that this two-fold goal is achieved for the recently developed self-adaptive multi-elimination preconditioner. For this purpose, we revise the underlying idea and analyze the performance of impl...

Accelerating Preconditioned Iterative Linear Solvers on GPU

Linear systems must be solved in many scientific applications, and the solution of these systems often dominates the total running time. In this paper, we introduce our work on developing parallel linear solvers and preconditioners for solving large sparse linear systems using NVIDIA GPUs. We develop a new sparse matrix-vector multiplication kernel and a sparse BLAS library for GPUs. Base...

ILU and IUL factorizations obtained from forward and backward factored approximate inverse algorithms

In this paper, an efficient dropping criterion has been used to compute the IUL factorization obtained from Backward Factored APproximate INVerse (BFAPINV) and ILU factorization obtained from Forward Factored APproximate INVerse (FFAPINV) algorithms. We use different drop tolerance parameters to compute the preconditioners. To study the effect of such a dropping on the quality of the ILU ...


Publication date: 2014